Digital Pathology Podcast
108: DigiPath Digest #14 (AI in Pathology: Case Prioritization, Kidney Biopsy Analysis and the Need for Consistent TIL Quantification).
In this 14th episode of DigiPath Digest, I introduce a new course on AI in pathology, designed to help pathologists understand and confidently navigate AI technologies.
The episode focuses on various research studies that highlight the integration and effectiveness of AI in pathology, particularly in colorectal biopsies and kidney transplant biopsies, emphasizing the importance of seamless workflow integration.
You will also learn about challenges in manual assessment of tumor-infiltrating lymphocytes and HER2 expression in breast cancer. I advocate for more consistent and precise AI-driven approaches.
And there is an opportunity for a discounted beta test of the new AI course.
00:00 Welcome to DigiPath Digest #14
00:24 New AI Course Announcement
01:51 Deep Learning in Colorectal Biopsies
09:17 AI in Kidney Biopsy Evaluation
16:12 Automated Scoring of Tumor Infiltrating Lymphocytes
24:22 AI for HER2 Expression in Breast Cancer
31:13 Conclusion and Course Details
THIS EPISODE'S RESOURCES
📰 A deep learning approach to case prioritisation of colorectal biopsies
🔗 https://pubmed.ncbi.nlm.nih.gov/39360579/
📰 Galileo-an Artificial Intelligence tool for evaluating pre-implantation kidney biopsies
🔗 https://pubmed.ncbi.nlm.nih.gov/39356416/
📰 Automated scoring methods for quantitative interpretation of Tumour infiltrating lymphocytes (TILs) in breast cancer: a systematic review
🔗 https://pubmed.ncbi.nlm.nih.gov/39350098/
📰 Precision HER2: a comprehensive AI system for accurate and consistent evaluation of HER2 expression in invasive breast Cancer
🔗 https://pubmed.ncbi.nlm.nih.gov/39350085/
▶️ YouTube Version of this Episode:
🔗 https://www.youtube.com/live/jkT8dTxelt4?si=xT6MNH7O4HuUnAN6
📕 Digital Pathology 101 E-book
🔗 https://digitalpathology.club/digital-pathology-beginners-guide-notification
🤖 "Pathology's AI Makeover" Online Course 50% OFF
🔗 Let me know that you are interested on LinkedIn (just 10 spots available)
Become a Digital Pathology Trailblazer: get the "Digital Pathology 101" FREE E-book and join us!
Good morning, and welcome, my digital pathology trailblazers, to our 14th DigiPath Digest. I'm going to wait for you to join the live stream, and whenever you're here, just let me know that you're here. I'm going to wait for you a little bit and give you a few announcements before we start. What I created last week is a new course on AI in pathology. This new AI course is going to take you from being maybe a little bit overwhelmed with all the new stuff that's coming in, through actually understanding what AI is and how it can be applied, to being confident with the different technologies and concepts. It consists of audio commentary, YouTube videos, papers, and original content as well. I did not calculate how many hours it is, but altogether it's probably less than five. It is comprehensive enough, but you can go through it in one weekend, so it's not overwhelming, but there is enough content. I'm still putting together all the pages, but if you're interested, I'm going to be looking for 10 people to beta test it at a reduced price. Currently I have it at 99, so for those 10 people it's going to be 50% off. If you would like to beta test it for half the price, let me know: put "course 50" in the comments, and I'm going to send you the link whenever I have it ready. Probably today; the course is ready, but all the payment stuff is not ready yet.
So let us dive into our topic today, which, as always, is AI in pathology. We're going to start with "A deep learning approach to case prioritisation of colorectal biopsies." This was published in Histopathology, and the impact factor of this journal is 3.9. I was looking at it: this is a group from Dublin, Ireland, then the UK, and somebody from Toronto. And then I looked at the name of this classifier. We're not going to skip the paper, don't worry, but I just want to tell you the story of how I realized who wrote it. The classifier is called Triagnexia, and that sounds very much like Diagnexia. I know people from Diagnexia, and sure enough, Diagnexia and Deciphex are part of this publication, and the work was done with their system. The first author is Chiara D. White, but I looked at the other authors, and I'm like, oh, Jenny, I worked with Jenny; Adam, I worked with Adam; I know Pierre and Donal, who's the CEO of Deciphex. So it was funny to see this publication, because I work with them in the preclinical setting, on the drug development part of their work, but this was for diagnostics.
Let's see what they did and what this triage tool was. They tried to create and validate a weakly supervised AI model for detection of abnormal colorectal histology. They included dysplasia and cancer, so that they could prioritize biopsies according to clinical significance, meaning severity of diagnosis. That raises the question: are you going to triage the severe cases first because they're more important? Are negative cases less important? No, that's not what the prioritization tool is for. It's more for managing the cognitive burden on the pathologist who's diagnosing, because negative cases are a significantly lower cognitive burden on the pathologist. When a patient is faced with the possibility of being diagnosed with cancer and they get the biopsy, they don't know if it's going to be negative or positive, right? To them, the result is the main thing. It doesn't really matter what the result is; they're waiting for it. But from the workflow management perspective of the pathologist, the negative cases are less cognitively burdensome. I don't know if burdensome is a word, but they require less cognitive effort, and positive cases require more.
So if you can manage the cases and match them to your workflow, that's already a big gain, right? So what happened here? They had a lot of images: 24,983 digitized images, assessed by multiple pathologists in a simulated digital pathology environment. The application was implemented as part of a point-and-click graphical user interface. I assume it's a simple one, point and click. This is also very beneficial for the pathologist's cognitive load, because the less they have to figure out how to use the software or do multiple things in the software, the better. And pathologists assessed the accuracy of the AI tool, its value, its ease of use, and its integration into the digital pathology workflow. I cannot emphasize this integration enough. You may have the most fantastic tool on the planet, but if it is preventing you from doing your work in the workflow that you have, you're not going to be using it. If it's slowing you down, if it makes you think more or spend more cognitive effort, it's just not well integrated, and you're not going to be using it. It has to be integrated so that you can keep the flow.
And what were their results? They had 100 single-slide cases, and they achieved a micro-average model specificity of 0.984, a micro-average model sensitivity of 0.949, and a micro-average F1 score of 0.949 across all classes. So here we need to talk about the micro-average way of doing this. What classes did they have here? They included dysplasia and cancer; from the abstract we don't know exactly what the classes were. There are different ways to calculate those metrics. One of them is the way we have here, but often you calculate them per class. So you can have a very well-performing class, and you can have a class that's not performing that great. And often that's a class that's underrepresented.
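To make the per-class versus pooled distinction concrete, here is a minimal sketch in Python. The class names and counts are made up for illustration and are not taken from the paper; only the two averaging formulas are standard.

```python
# Hypothetical per-class counts (NOT from the paper): true positives (tp)
# and false negatives (fn) for three made-up classes.
counts = {
    "normal":    {"tp": 90, "fn": 10},
    "dysplasia": {"tp": 40, "fn": 10},
    "cancer":    {"tp": 4,  "fn": 6},   # deliberately under-represented
}

def sensitivity(tp, fn):
    """Fraction of true cases of a class that the model caught."""
    return tp / (tp + fn)

def macro_sensitivity(counts):
    """Compute sensitivity per class, then average the class scores."""
    per_class = [sensitivity(c["tp"], c["fn"]) for c in counts.values()]
    return sum(per_class) / len(per_class)

def micro_sensitivity(counts):
    """Pool all instances across classes, then compute one sensitivity."""
    tp = sum(c["tp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return tp / (tp + fn)

macro = macro_sensitivity(counts)
micro = micro_sensitivity(counts)
print(f"macro: {macro:.4f}")  # macro: 0.7000
print(f"micro: {micro:.4f}")  # micro: 0.8375
```

In this toy example the small "cancer" class drags the macro average down, while the micro average is dominated by the larger classes. As a side note, when every case gets exactly one label, micro-averaged precision, sensitivity, and F1 all coincide, which would be consistent with the two identical 0.949 figures quoted above, although the abstract does not spell out the setup.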
An underrepresented class is one with only a few examples. So to assess the overall performance of the model, instead of doing this per class, you take all the instances together and calculate the metric over that pool, and that is the micro average. There's also a macro average. Here they did the micro: all the classification instances went together into one score. And pathologists reported positive impressions of the overall accuracy of the AI in detecting colorectal pathology abnormalities. The conclusion: a high-performing colorectal biopsy AI triage model can be integrated, and we emphasize can be integrated, because we do want integration, into a routine digital pathology workflow to assist pathologists in prioritizing cases. So that's the conclusion: it can be integrated. I assume you can just get it from them and use it, which is fantastic. Let me know if you have any comments on that, any questions. Obviously, a prerequisite (I'm using words that are difficult for me to pronounce today) is digitization, that digital pathology workflow, right? This is implemented into a routine digital pathology workflow, so if you don't have a digital pathology workflow, then obviously you cannot use any of these digital tools. And if you're joining me right now, let me know where you're tuning in from. I'm looking forward to hearing from you in the comments, and also if you're interested in the AI course. I'm going to talk a little bit more about the course at the end.
Let me get to the other papers that I picked for you today. This one is "Galileo, an artificial intelligence tool for evaluating pre-implantation kidney biopsies." This is the Journal of Nephrology, impact factor 1.809, a group from Italy, first author Albino Eccher, so Eccher et al. Some of the authors are from Pittsburgh, and some are from Cologne, Germany. So what happens here? We have kidney transplants, right?
And you have to check if the kidney that you're going to transplant into somebody else is okay. And I love this name; I didn't know this term: "Pre-transplant procurement biopsy interpretation is challenging." Procurement biopsy. Basically, this is the biopsy of the kidney of the person who died, to check if this is actually a healthy kidney. Interpretation is challenging because there are multiple things that you have to assess. Each one of them is not that difficult to recognize, but when you have to combine them and count them, you are basically creating a huge bottleneck for the interpretation. And there's also a low number of renal pathology experts. I looked at those images. A low number of renal pathology experts: I guess any time you need an expert, the number is low. In this case, those procedures are going to be taking place in hospital settings, and then you have to give the biopsy to a pathologist. If you don't have one, then digital pathology would be a fantastic way of doing it. But if you do have one, they need to be able to interpret kidneys. So here they present Galileo, an AI tool designed specifically to assist the on-call pathologist, so they have an on-call pathologist, with interpreting pre-implantation kidney biopsies. I'm thinking here, and this basically demonstrates that I would need to learn more about the workflow, because if we're working with H&E images, then this biopsy is going to be processed for several hours before it can be looked at. That puts something that you're going to learn about in a second into perspective, but let's have a look. Whole slide images were acquired from core needle and wedge biopsies of the kidney, and then a deep learning algorithm was trained to detect the main findings evaluated in the pre-implantation setting. And what are these main findings?
It's going to be normal glomeruli; globally sclerosed glomeruli (normal are normal, sclerosed are not normal, they're sick, and they have a different appearance, right?); ischemic glomeruli; and arterioles and arteries, which are not glomeruli but vessels. So there are already one, two, three, four, five components that need to be assessed, probably in a systematic or structured way. And you know what they used? They used the Aiforia Create platform. I'm super happy whenever I see somebody I worked with, or somebody who's a Digital Pathology Place sponsor, a sponsor of my educational platform about digital pathology, and Aiforia is a sponsor. They have been with us for three years, since we actually started sponsorships. So, Aiforia Create, and the algorithm was validated on an external dataset by three independent pathologists to evaluate its performance. They actually have this validation feature in the software: you can recruit pathologists even if they have never worked with Aiforia, and they can access it from the browser and do the validation. Galileo had precision, sensitivity, and F1 score of 81.96, 94.39, and 87.74, and also, sorry, a total area error. I looked at the numbers before and I'm like, what's this 2.81? That's the total area error. So the error was low, but I was wondering why this number was so low, because I didn't read the full sentence. Reading takes you a long way. In the validation set, the numbers were a little bit lower, but still above 70, and the error was 2%. Galileo was significantly faster than pathologists, requiring only 2 minutes versus 25, 22, and 31 minutes for three separate humans. And that is where my point about perspective comes into play. This is how I imagined it: you have this transplant setting where it has to be done ASAP and those minutes count, which they do, but you still need to do H&E.
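The figures quoted above can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the reported F1 is the usual harmonic mean of precision and sensitivity, and the averaging of the three reading times is my own simplification, not something stated in the abstract.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (same units in and out)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

# Figures as quoted in the episode, in percent.
precision, sensitivity = 81.96, 94.39
f1 = f1_score(precision, sensitivity)
print(round(f1, 2))  # 87.74, matching the quoted F1 score

# Reading times as quoted in the episode, in minutes.
galileo_minutes = 2
pathologist_minutes = [25, 22, 31]

# Assumption: compare against the plain mean of the three pathologists.
mean_pathologist = sum(pathologist_minutes) / len(pathologist_minutes)
speedup = mean_pathologist / galileo_minutes
print(mean_pathologist, speedup)  # 26.0 13.0
```

So the three quoted performance numbers are internally consistent, and on these figures the tool is roughly an order of magnitude faster than the human readers, before counting any time needed for slide preparation and scanning.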
Maybe they use frozen sections; I don't know, I should have checked that in the paper. That might be a faster route than routine H&E, but they don't say it in the abstract. They just say whole slide images. Usually you scan FFPE, formalin-fixed paraffin-embedded, tissue and not frozen, but you can do it for frozen. Anyway, whichever method you use, the faster you can image, the sooner this algorithm can run. And the nice thing that I liked a lot: Galileo-assisted detection of renal structures and quantification information was directly integrated into the final report. There are always questions like, how do you report it? They figured out how to report it. You can check this paper to see how to incorporate AI results into the final report. Whenever I see a conclusion that says "it shows promise," I'm like, what do you mean, shows promise? I want a conclusion that says, we're implementing it from now on and we're using it because it speeds us up. But it shows promise; that's okay, that's what we have, right? It shows promise in speeding up pre-implantation kidney biopsy interpretation and reducing inter-observer variability. This is always a great advantage of AI, right? Of any image analysis. It reduces inter-observer variability, and you can be a lot faster in terms of the evaluation.
And since we are talking about reduced inter-observer variability, that is a perfect segue to the next paper, titled "Automated scoring methods for quantitative interpretation of tumour infiltrating lymphocytes (TILs) in breast cancer: a systematic review." So this is going to tell you what has been done so far. This is a group from Malaysia, and there is also a Department of Computer Science from the UK. But why am I passionate about TILs? Alex, why do you even care about TILs? You do preclinical, right? Let me tell you the story. There is a story about TILs.
At some point, PathPresenter, which is a collaborator of Digital Pathology Place as well, and a platform for viewing slides, like an image management system but on steroids for different things, had a collaboration with the FDA. The goal of this collaboration was to put up a lot of breast cancer slides and have pathologists be more concordant on evaluating TILs. And I'm like, why do you want people to be more concordant on guesstimation? Because we are not counting these things, we're estimating the amounts, and we're inherently, physiologically bad at estimating things in an image, while computers are very good at counting those things. This doesn't really have anything to do with PathPresenter and this initiative. It's the general concept. When I hear, oh, let's make pathologists more concordant at estimating the amount of something in the tissue, at estimating the amount of positive staining, I'm like, let's just not do it anymore. Let's have image analysis do it for me. It's like saying, okay, instead of using a printer, I'm going to start writing stuff down by hand. Why? No, don't do it. Or, instead of an Excel spreadsheet, I'm just going to write all the math down and calculate it on my own, because I'm so good at it and the computer is not? It's totally the opposite, right? So it's the same problem: we didn't have the option to do it with image analysis before, but now we have. So stop doing it without image analysis, in one way or another. You don't have to be fully digitized. You can do it from the microscope. Just use it as a tool wherever you can. IHC is a mandatory tool, and all the molecular ways of diagnosing are mandatory tools; for everything that pathologists need to count, I would mandate image analysis too. Obviously I cannot mandate anything, but I can tell you my personal opinion on the evaluation of TILs. So let's move into the paper now, after my rant on guesstimation in pathology. Let me know what you think about that.
I bet I may get some polarizing comments on this statement. That's okay. The tumor microenvironment of breast cancer mainly comprises malignant, stromal, and immune cells and tumor-infiltrating lymphocytes, TILs. Assessment of TILs is crucial for determining the prognosis and outcome. Especially if something is crucial for determining the prognosis, use a tool that's precise. And as we all know from my rant, manual TIL assessments are hampered by multiple limitations: low precision, poor inter-observer reproducibility, and time consumption. Have we heard that before? Yes, probably in every single paper that describes the use of AI tools for quantification of something in the tissue. But I guess we need to drive this point home over and over again. Automated scoring emerges as a promising approach. Not promising, no: the only approach you should use. This review, it's a literature review, presents a comprehensive compilation of studies related to automated TIL scoring, sourced from literature databases: Web of Science, Scopus, ScienceDirect, and PubMed, which is where I get my alerts from. In these databases the keywords used were artificial intelligence, breast cancer, and tumor-infiltrating lymphocytes. There are specific frameworks for doing this type of study. One of the frameworks is PICOS, or PICO, and I have a definition of it: P is population or problem, I is intervention, C is comparison or control, and O is outcome. You analyze based on those categories and pick your publications according to this framework. They also wanted the reporting to adhere to the PRISMA guidelines, which are guidelines from 2020 for reporting. They found 1910 articles, which is a lot of articles, right? And then they applied all these frameworks, and you know how many were left? 27! 27 were left. That's okay, we still have 27. And what did we learn from this?
We learned that studies on automated TIL assessment are concentrated in developed countries. We need to remember that this paper is from Malaysia, which I think is still considered a developing country, and correct me if I'm wrong in the comments. So, developed countries like the US and the UK, and we also had a paper from Italy today. What the algorithms published in those 27 studies were doing was semantic segmentation and object detection, and they used CNNs, convolutional neural networks. It's funny, because the classical computer vision approaches used to be the rule-based approaches. But now, with the development of foundation models, I don't know what to call the very specific CNN-based models. How about I just call them that: CNN-based segmentation and object detection models. Let's be descriptive. So semantic segmentation and object detection were the most frequent automated tasks, and CNNs the most frequent machine learning approach applied for model development, respectively. And all models developed their own ground truth datasets. Maybe that was the point of the PathPresenter initiative, to develop a database as well. I have to look that up. If anybody from PathPresenter is listening to this, please leave a comment and say what exactly you were doing. But when I see that they all developed their own ground truth datasets, I'm like, this is not efficient. Everybody has to go and annotate. Where are those annotations? Give those annotations to people, or make them pay for the annotations. Even if you made people buy them from some repository, it would probably cost less than the hours that pathologists or scientists spend annotating these things. As you can see, I'm very sensitive about the time spent annotating, because I don't see any reusability of all those hours of work going into all these projects, every single one of them. Well, let's say only the 27 that were analyzed here.
They all developed their own ground truth datasets for training and validation, and 59 percent assessed the prognostic value of TILs. In conclusion, it shows significant promise. I'm like, significant promise. So yeah, that was TILs. And the same thing applies to HER2; we can do one more paper very quickly. The same thing applies because HER2 is going to be about visual assessment of IHC. And what problems do we have in visual assessment of IHC? Let's see if they used similar wording. This was published in BMC Cancer. I think the previous one was also published in the same journal. Yes, so I guess maybe they had an issue about breast cancer, and part of this issue was covering AI. So, BMC Cancer for HER2, and this group is from China. What do they say? There probably is some specific wording that all the papers use for why manual, visual assessment by a pathologist is challenging. But okay, the thing here is that we have novel anti-HER2 targeted drugs, such as ADCs, which are antibody-drug conjugates, and it has become increasingly important to accurately interpret HER2 expression in breast cancer. And guess how you don't do it accurately? By not using computers for it. I don't trust myself with adding multiple numbers; if it's more than five numbers, I use a calculator. And wait for this: previous studies have demonstrated high intra-observer and inter-observer variabilities. Okay, they don't say it was slow, but not only inter, also intra, meaning you look at something, then you take a rest for a week or two, you look at it again, and you say something else, because that is how this visual guesstimation works. "There exists a strong requirement," I like this wording, a strong requirement, but where is it? Put it in the guidelines, some guidelines. Anyway, a strong requirement "to develop AI systems to achieve high-precision HER2 expression scoring for better clinical therapy."
I might have told you the story where I went with MD pathologists from Germany to a workshop on how to be more consistent in evaluating PD-L1. And it's a similar story. HER2 is a membrane marker; PD-L1 is a membrane marker in epithelial cells. So you basically estimate how much of it there is in the tissue. For different targeted therapies, there are different thresholds of positivity, below which you don't get the drug and above which you do get the drug. At the cutoff, let's say the threshold for a drug would be 10%, you would ask all those pathologists, what is it? And half of the room, I don't know how many of us were there, something like a hundred people, it was in Berlin at the Charité, so a hundred people, and 50 would say it's below and 50 would say it's above. So it's guessing. You guess whether this person is going to get the drug or not. If the expression is very high or very low, the concordance is higher, but around the threshold, for this group of patients, you basically toss a coin, right? That is a great way of deciding on treatment. Not.
"In this present study, we collected breast cancer tissue samples and stained consecutive sections with anti-calponin and anti-HER2 antibodies." We're going to say in a second what this calponin is for. Digital images of the immunohistochemical slides were selected and interpreted as HER2 3+, 2+, 1+, and 0. My question here is: interpreted by whom? Did you ask pathologists to interpret them, and then compare other pathologists to these pathologists, who were guesstimating as well? So there are study design questions here. AI models were trained and assessed using annotated training and testing sets. Annotated; we would have to see what kind of annotation that was. And the model was trained to automatically identify ductal carcinoma in situ using calponin.
So calponin was what identified the ductal carcinoma in situ, through staining and annotation of the myoepithelium, and this component was then filtered out of the HER2-stained slides using image overlapping techniques. They had two consecutive sections, that's what they say here, and they overlaid one on the other. One was stained with calponin, and the calponin-positive component was then removed from the other image for analysis, which I think is a good way of leveraging image analysis. In phase one, pathologists interpreted 112 HER2 whole slide images without assistance. In phase two, pathologists read the same slides using AI after a washout period of two weeks. So they got two weeks of rest. And what happened? AI improved the accuracy of reading. Oh, nice: with AI it was 0.9, and without it was 0.7. The number of HER2 1+ patients misdiagnosed as HER2 0 was significantly reduced. Oh my goodness, that's basically what I was talking about around the threshold. HER2 1+ is going to get the drug and HER2 0 is not going to get the drug, right? Out of 279, with AI assistance 32 were misdiagnosed, versus 65 by pathologists alone. So it reduced the number of misdiagnoses by half, right? Pathologists alone misdiagnosed twice as many as with the AI. So all those patients who previously were not diagnosed as HER2 positive can now benefit from ADC drugs, antibody-drug conjugates. The algorithm also improved the intra-group consistency of HER2 readings by pathologists with different years of experience, and the improvement was most pronounced in junior pathologists. And what is our conclusion today? "We proposed a high-precision AI system to identify and filter the ductal carcinoma in situ components and automatically evaluate HER2 expression in invasive breast cancer." And I support this conclusion. I propose this too. I very much propose this. So these are all the papers for today. And at the beginning, I mentioned that we have a new AI course.
The name of this course is "Pathology's AI Makeover," because there's a lot of AI coming into pathology. Part of it we're discussing in this journal club, but there is more, and there is more coming, like agentic AI and glassless pathology. New concepts are coming into the mix, and I created this course so that you can confidently navigate this landscape. If you're interested in test driving it, I am opening this to 10 people for half the price. The full price at the moment is 99, so it's going to be 49. If you're interested in the 50 percent discount, let me know in the comments. I'm also going to share this with my list, so if you're already on my list, you're going to learn about this. This is open to 10 people, and I will then be asking for your feedback. So if you're interested in the course, let me know. Your mission, should you choose to accept it, is to be a confident navigator of the AI landscape in pathology. If this is something you're looking for, let me know. And I'll talk to you in the next episode. Thank you so much for showing up.