Digital Pathology Podcast

116: DigiPath Digest #18 | Federated Learning in Pathology. Developing AI Models While Preserving Privacy

Aleksandra Zuraw Episode 116


In today's DigiPath Digest, we delve into federated learning, a decentralized approach to AI training that preserves data privacy.

I discuss recent papers from PubMed and share my experiences experimenting with AI tools like Perplexity and Gemini for research efficiency.

You will also get updates on upcoming plans, including leveraging AI to share more podcasts with you.

Did I mention that this is the last livestream of the year as I head to Poland for Christmas? No more DigiPath Digests this year. We got to number 18 (I overestimated it a bit in the podcast), and you have been instrumental in continuing this series!

Big THANK YOU to all the Digital Pathology #TRLBLZRS showing up every Friday morning for this!

Join me as we tackle the nuances of federated learning and its impact on healthcare and pathology.

00:00 Introduction and Greetings
00:18 Today's Topic: Federated Learning
00:57 AI Tools and Updates
04:39 Federated Learning in Detail
08:03 Challenges and Benefits of Federated Learning
11:21 Exploring More Papers and Future Plans
22:53 Wrapping Up and Final Thoughts

Links and Resources:


Publications Discussed Today:

📝
Privacy-preserving federated data access and federated learning: Improved data sharing and AI model development in transfusion medicine
🔗https://pubmed.ncbi.nlm.nih.gov/39610333/

📝
A review on federated learning in computational pathology
🔗https://pubmed.ncbi.nlm.nih.gov/39582895/


If you enjoyed this episode, please subscribe and leave a review on your podcast listening app!


Become a Digital Pathology Trailblazer: get the "Digital Pathology 101" FREE e-book and join us!

Aleksandra:

Welcome, my digital pathology trailblazers. How are you today? Welcome from Pennsylvania. It's 6 a.m., time to do our weekly DigiPath Digest. When you are here, let me know in the comments. While I'm waiting for your comments, I'm going to give you a few updates on other stuff.
Today's topic is federated learning. When I got my PubMed alert, there were two papers about federated learning. I already talked to some people about it. Oliver Saldana wrote a paper about it; after the livestream, I'll link to the podcast with him and my paper review. Federated learning is a way of decentralized training of AI. When I talked to Oliver Saldana, he told me, and that's what we're going to learn in the papers today, that this is a way of not having to aggregate data.
We're going to dive into it in a second, but I wanted to give you a few updates about new AI tools that I have been experimenting with. One of them is Perplexity. It's not new, but I was not using it before. I was mainly using GPT and Claude, just manually doing the research first and pasting the sources into the AI. I got myself Gemini for my Google Drive, which I still don't know how to use. If you have any good tutorials on that, let me know in the comments. Share a link to a good Gemini tutorial, because I was trying to work with it alongside my Google Doc, basically prompting it like ChatGPT, and it didn't work for me. So I need to do some education.
But the new tool is Perplexity. And I see people liking us on Facebook. Great to have you on Facebook as well. If you're just joining the livestream, let me know in the comments that you're here. Even if you're viewing the recording, also say hi and comment on whatever you like so that more people can see this livestream. Perplexity is AI-powered search. I was doing some research on foundation models for pathology because I need to present on it at work today, and it shortened the time of research significantly. It's my new best friend.
I'm going to be exploring it. It can also help you create pages, so I created a page about me. Maybe in the description later I'm going to leave the page about me. It pulls sources, but you have to verify, because I came across some nonsense when I was giving it a prompt about me.
Welcome, Colombia, amazing to have you. And hi, Yale Medical School. I think this is Eastern Time, so amazing to have you here.
When I was researching myself, it pulled sources, but then somehow, and I don't know why because it only happened one time, when I wrote "Aleksandra Żuraw digital pathology" it created information about me that I am a prominent Python community programmer. No, that is not the case. I am not a programmer. I am now an avid voice prompter of different AIs. So, thank you, Perplexity, my new best friend.
One reminder: if you are here for the livestream, or if you have been here for any other livestream, you can get a certificate of attendance and then submit it to your continuing education records. In the U.S. that would be category two. You basically show the certificate or put it in your records. I will link in the description to the papers we were discussing today, and you can submit that.
So, without further ado, these were my updates. One more update is that this is gonna be the last livestream for this year. I'm traveling back to Poland for Christmas soon, so we're gonna stop the DigiPath Digest for the year. I'm exploring a way to have AI assist me with generating short clips about those papers, where I would record the intro and outro. I tested it a little bit on the podcast. I don't know how people liked it. I'm going to test it for a more extended period, and you let me know if you like it. If I see that there is some interaction, downloads, and listens, then I'm going to continue, so that you can actually have these updates more frequently. Just once a week, we get a printout, or the list, of a lot of publications.
We only manage to talk about a few, so let's not waste any more time. "Privacy-preserving federated data access and federated learning: Improved data sharing and AI model development in transfusion medicine." This publication was for transfusion medicine, but we are focusing on the federated learning part, so it doesn't really matter that it's not very pathology specific.
Data from different aspects of healthcare, administrative, digital health, and research-oriented data, inform healthcare, patient care, and research, of course, and integrating AI into healthcare requires an understanding of these data infrastructures. Not to underestimate the data infrastructures. I was talking to Hamid Tizhoosh. He was a guest on the podcast. It's already recorded, not edited yet, so whenever it's out there, I'm going to let you know. I asked him about federated learning, and he said one of those challenges, not only for federated learning but for any kind of data work, is that the data is not structured and not accessible for these advancements. Challenges such as data availability, privacy, and governance are general data challenges. He was talking specifically about data being accessible within an institution.
Federated learning is a decentralized AI training approach, and it addresses these challenges, allowing models to learn from diverse data sets without data leaving its source. So what does it mean, "without data leaving its source"? Normally, the first approach for training models, and it still is the approach for training those foundation models, is that you aggregate your data. In pathology, it's going to be slides and reports and different things that go with the image data if it's a multimodal model. Sometimes it's only vision. When it's text and vision, you have multimodality. And, of course, we're striving for even more multimodality across different medical specialties. But normally you aggregate, and that is the problem, because you get all the data together.
They're all together. All the information is together, and this can be done within one institution. If you want to pool data across institutions, there are all these data privacy and sharing agreements. When I was talking to the leaders of the BIGPICTURE initiative, a public-private consortium in Europe that is building a huge repository of slides to build models on, they said that putting together data sharing agreements took them over a year. So that could be circumvented. How cool, right?
Instead of aggregating everything, you have this model that goes to all these data sources. So let's say we have five institutions: it's going to go to Institution 1, learn something there, then Institution 2, Institution 3, Institution 4. It's not even one centralized model that goes there; it's more like instances. The technicalities of this would need to be explained by a computer vision specialist. But basically, you don't have the aggregation; you have the model going and learning from different places. These are called nodes. It learns from all these images and is wiser, quote unquote, than it would be if it only learned on data from one institution. Why do we want that, other than all these privacy concerns and things like that? The diversity of data.
Here I have a question mark. Let me explain why. In the methods and discussion, they say federated learning can offer significant benefits in transfusion medicine, and in general in medicine and pathology, enhancing predictive analytics, personalized medicine, and operational efficiency. Predictive models trained on diverse data sets by federated learning can improve accuracy in forecasting blood transfusion demand. That's why I have this question mark, because I thought it was at a higher level and not going to the data of a single patient, but maybe that can be the case as well, because here you can also create personalized treatment plans, and they can be refined.
That is, by aggregating patient data from multiple institutions using federated learning. So not only can you train a generalizable model by sending it to different data sets, you can actually aggregate information about one patient, which currently in many institutions requires sending a fax. Some institutions still require you to send a fax.
What are the challenges? I see a lot of challenges. The challenges are the same as with any kind of automation or streamlining. Even with electronic health records in institutions, there's no communication between these systems. So of course the challenges will be data standardization, governance, and bias. If we only have data from one particular demographic or patient population, then the model is going to be trained on that data, and it's going to be giving information about this type of cohort, this type of images, or whatever you're training it on. Bias is a challenge in AI research.
But they say federated learning represents a transformative approach to AI development in healthcare. By leveraging diverse data sets while maintaining data privacy, federated learning has the potential to enhance predictions, support personalized treatments, and optimize outcomes, ultimately improving patient care and healthcare efficiency. An interesting perspective for me is that I had associated it with creating a model that can serve multiple patients, but here it says this federated learning approach could get data for one patient from different institutions.
And I am saying hi to Baltimore. Baltimore is so close. It's just one hour, 20 minutes from me. When the USCAP conference was in Baltimore, everybody was like, "Oh, Baltimore is not that fantastic of a city." I'm like, fantastic, I can drive and be home for dinner. So hi to Baltimore, Jason. Great to have you.
So that's federated learning, and this is our paper number one. A couple of comments from my conversation with Hamid: I asked him about federated learning.
It's so difficult to aggregate those data, to collect them. All these foundation models are coming out, and each of them is trained on its own data. How do they compare? We have an in-depth discussion in the podcast. What he said is that it's difficult to set up.
Let's go to the next paper, and at the end I'm gonna read you all the topics and maybe my personal preferences. There's a review of federated learning in computational pathology as well today. I asked him, what's the problem? I heard about it at the beginning of the year, and there were publications. Is it not as fantastic as I thought? My initial thought is always that every new technology has the potential to revolutionize, and then I start using it or learning more about it, and I'm like, surprise, surprise, it has limitations like every other technology. I don't know why I keep being surprised, but that's the beauty of life, you know, technology surprises me. No one technology solves all, but that's okay.
Yesterday, when I was putting together this foundation model presentation, and I'm gonna modify it and present it here and make a video for YouTube as well, so you're gonna get access to it, I think I paid like a hundred dollars in different subscriptions to get access to all these pro tools, like Perplexity Pro and a new one, Gamma, for presentations. I was using tome.app, which I already pay for. Yesterday, I bought a monthly subscription for a different one, basically testing different things and seeing what is easiest for me to use, because I refuse to have a steep learning curve right now.
Let's focus on federated learning. He says it's difficult to set up. Let's see what this group from Zurich says. They wrote a review on federated learning in computational pathology and published it in the Computational and Structural Biotechnology Journal this October. So, what happened here? I have some thoughts about it.
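As a rough illustration of the "model goes to the data" idea discussed for the first paper, here is a minimal sketch of federated averaging in plain Python. It is not taken from either paper: the three "sites", their synthetic data (a made-up relationship y = 3x + 1), the tiny linear model, and every function name are assumptions chosen only to keep the example small. The point it shows is that each site trains on its own records and only model parameters plus a sample count are sent back to be averaged.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent passes on one site's private data.

    Only the updated weights and the sample count are returned;
    the raw (x, y) records never leave this function.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n

def federated_average(updates):
    """Average the site models, weighted by how many samples each site holds."""
    total = sum(n for _, n in updates)
    w = sum(site_weights[0] * n for site_weights, n in updates) / total
    b = sum(site_weights[1] * n for site_weights, n in updates) / total
    return (w, b)

def make_site(rng, n, x_low, x_high):
    """Hypothetical site data: y = 3x + 1 plus noise, over a site-specific x range."""
    return [(x, 3 * x + 1 + rng.gauss(0, 0.05))
            for x in (rng.uniform(x_low, x_high) for _ in range(n))]

def train_federated(sites, rounds=300):
    """Coordinator loop: send out the current model, collect and average updates."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates)
    return global_weights

rng = random.Random(0)
# Three sites that each see a different slice of the input range,
# standing in for institutions with different patient populations.
sites = [make_site(rng, 40, 0.0, 0.4),
         make_site(rng, 60, 0.3, 0.7),
         make_site(rng, 50, 0.6, 1.0)]
w, b = train_federated(sites)
print(f"federated model: y = {w:.2f}x + {b:.2f}")
```

Real deployments add things this sketch leaves out, such as secure communication, handling of sites that drop out, and protection against reconstructing data from the shared parameters.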
In this review, they say training generalizable computational pathology models is dependent on large-scale, multi-institutional data. Yes, it is. If we have multi-institutional data, that would be the idea: different populations represented, different patients, everything different, so that we avoid bias. But we have strict data privacy rules hindering the creation of large data sets. And federated learning is addressing this dilemma. It's allowing separate institutions to collaborate in the training process while keeping each institution's data private and exchanging model parameters instead. Although I need to do some more investigation, because there was another paper saying that with these approaches that exchange just the parameters, there is actually a way to identify patients from them. So I'll ask Perplexity.
Oh, and we have people joining from Massachusetts. Hello.
They analyzed 15 studies. I learned about federated learning at the beginning of the year. They wanted to explore this emerging technology for computational pathology applications, and they saw proof-of-concept studies. The important thing, from my conversation with Oliver, I don't think I talked about it with Hamid, was: what's the performance when you do aggregated versus federated learning? Federated learning does have a central node. Swarm learning is the one that doesn't even have a central node; it's a subset of federated learning. So that's just a little update. But the question is, will they perform the same? Are they going to be good enough? The important thing was performance equivalency between models trained in a federated manner compared to a centralized manner, and the performance was comparable.
And to facilitate broader adoption in real-world environments, it is imperative to establish guidelines for the setup and development of federated learning infrastructure. In pathology, we would benefit from guidelines and blueprints for setting up digital pathology labs.
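To make the "no central node" idea and the comparison with centralized training a little more concrete, here is a small sketch in plain Python. It is a deliberately simplified peer-to-peer parameter averaging scheme, not the actual swarm learning protocol and not a method from the review. The four nodes, their synthetic data (a made-up relationship y = 2x), the one-parameter model, and all function names are assumptions for illustration. It trains once without any coordinator and once on pooled data, so the two results can be compared.

```python
import random

def make_node_data(rng, n, slope=2.0):
    """Hypothetical private dataset for one node: y = slope * x + noise."""
    return [(x, slope * x + rng.gauss(0, 0.1))
            for x in (rng.uniform(0.5, 1.5) for _ in range(n))]

def local_steps(w, data, lr=0.05, steps=3):
    """A few gradient steps on the node's own data for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def peer_to_peer_training(nodes, rounds=50):
    """Every node keeps its own model copy; no coordinator holds a master copy."""
    weights = [0.0 for _ in nodes]
    for _ in range(rounds):
        # Each node trains on its own data only.
        weights = [local_steps(w, data) for w, data in zip(weights, nodes)]
        # Each node receives its peers' parameters and averages them locally.
        # All nodes hold the same number of samples here, so a plain mean is used.
        shared = sum(weights) / len(weights)
        weights = [shared for _ in nodes]
    return weights

def pooled_least_squares(nodes):
    """Centralized reference: pool every record, then solve in closed form."""
    pooled = [pair for data in nodes for pair in data]
    return sum(x * y for x, y in pooled) / sum(x * x for x, _ in pooled)

rng = random.Random(42)
nodes = [make_node_data(rng, 30) for _ in range(4)]
decentralized = peer_to_peer_training(nodes)
centralized = pooled_least_squares(nodes)
print(f"peer-to-peer: {decentralized[0]:.3f}, pooled: {centralized:.3f}")
```

On a toy problem like this the two estimates land close together, which mirrors the "comparable performance" point in spirit only; whether that holds for whole slide image models is what the reviewed studies had to test empirically.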
There are some in publications, but whenever there is something new, even if it's a great thing to start doing, the infrastructure is a challenge. It would benefit from the promotion of standardized software networks as well. These steps are crucial to further democratize computational pathology, allowing smaller institutions to pool data and computational resources and investigate rare diseases. Obviously, when the prevalence is not too high, the funding is a challenge. And then conducting multi-institutional studies, allowing rapid prototyping on private data.
"Democratize" here, right? I was thinking about democratization of AI when I was paying for all those subscriptions yesterday. I'm in a position to just buy them and use them, and if I don't like one, I cancel, even if I already paid for the whole month. But not everybody is in that position. So I did literally have that thought: these are paid resources that for somebody in the U.S. may not be that high of a cost, but for people in different countries or at different stages of their career, they are super limiting. They basically follow the freemium business model, where they show you the capabilities, and if you like it and want this edge in your research and efficiency, you pay money. I paid money to be more efficient yesterday to prepare this presentation, and this thought about democratization came to my mind. Whenever resources are needed, you always pay. You either pay with your time or you pay with your money. For these types of tools, you have to pay with money. If you don't, you're not going to have access to those tools. So that's that.
Let's look at what else we have in terms of papers, and I'm going to tell you what I'm planning to experiment with. Keep me accountable. If I don't come up with this in the near future, send me a comment like, "You said this is going to be on the podcast, where is it? I don't see it."
You did motivate me to finish my book when I was writing Digital Pathology 101. It took me, of course, so much longer, and I will be updating the AI chapter next year. So keep me accountable for that as well. It's like, "I want to buy the book, but I want the chapter to be updated." So let me know.
And let's look at what else we have today. "Bibliometric and Visual Analysis of Radiomics for Evaluating Lymph Node Status in Oncology." Bibliometric, so it's going to be literature research. I don't know if I'm going to be reading this one. Maybe not. This one we already tackled. This one is interesting: "Virtual Scalable Model of the Hepatic Lobule for Acetaminophen Hepatotoxicity Prediction." I might investigate. I'm gonna make a star here for the ones that I'm gonna be exploring further.
So let me tell you what I want to do, and feel free to keep me accountable in the comments and in messages asking where it is. I already tested this tool. It's called Google NotebookLM. You can give it a source and it's going to make you a short podcast. What I was experimenting with is that I would record myself, Aleksandra Zuraw, the real one, no AI, for the intro and outro. I would definitely listen to the whole thing, so it's not going to be me throwing some automated content at you. I would frame it on both ends of the podcast, and you would get a conversational version of the paper. Let me know if this Aleks plus AI approach is interesting for you. Just write the comment "Aleks plus AI," and then I'm gonna go through these papers and generate some of these for the podcast, and then you can basically listen to it, binge listen to it.
And there is a question here, smart working: could I receive a certificate of attendance? Yes, definitely. You're going to get the certificate.
So this hepatotoxicity one is interesting to me. Then, interesting: "Electronic health records based identification of newly diagnosed Crohn's disease cases." That is interesting.
Don't you put this new diagnosis in the health care records? What is this about? What did they do here? I clearly don't understand the title. Anyway, "Memory-efficient, on-the-fly tiling of histological image annotations using QuPath." Oh, yes, we must read this one, because it's about annotations, which I have several videos about. If you want to learn how to annotate for supervised learning, I'm going to link it below. And QuPath, there's a video about QuPath. Oh, this is a Polish last name. Where's this group from? Germany. Okay, close enough to Poland. So we're going to check that one.
And there's a review. I should read this one: AI in healthcare. Yeah, let's read it. And where is it? Biomedical Reports. Oh, it's already for the January collection, guys. Here we are already in 2025. We did federated learning. "Artificial intelligence applications in lymphoma diagnosis and management: opportunities, challenges, and future directions." Multidisciplinary healthcare. I'm super interested in multidisciplinary. Little star here, little star. Basically, it's going to be, "Oh, Aleks is going to read everything." Well, maybe not.
Oh, but this one I have to read: the comparative analysis of tertiary lymphoid structures for predicting survival of colorectal cancer, a whole slide image-based study. This is Precision Clinical Medicine, and this is a group from China. There is a definition of what these tertiary lymphoid structures are. They are aggregates of lymphocytes outside of lymph nodes, but they make these aggregates that look a little bit like lymph nodes. To confirm this definition, you need to stain them with IHC. It wasn't that obvious on H&E which were and which weren't tertiary lymphoid structures according to the definition. So yeah, we need to put a star here, huge star.
"Proteomics analysis reveals non-small cell cancer subtypes predicting chromosome instability and tumor microenvironment." I don't know.
When I hear proteomics, maybe I'm gonna skip this one. I'm skipping proteomics. Sorry. Okay. And I see some people writing that Aleks plus AI sounds interesting. So I'm gonna do this experiment, and you're gonna let me know what you think. And then we'll see whether we continue or don't continue. It also depends on the bandwidth. And here we're starting over.
So that was it for today. If you have any questions, or if you need a certificate, you can let me know in the comments. If you're on LinkedIn, I can send the link to you through a DM. If you're on YouTube or another platform, I might not be able to send it to you directly, so I'll comment asking you to DM me on LinkedIn or send me your email. The best way to get emails from me is to go to digitalpathologyplace.com and get the book that I was showing you. Let me show you where. Yes, if you go to my website and click on this button here that says Digital Pathology 101, you can get the free PDF version, and that gets you on my mailing list. Once you're on my mailing list, you get all the content I'm creating. You get emails about these DigiPath Digest livestreams. At the beginning, people were shocked by the number of emails, but now they join when they can, and if they don't join, I send a recording to them as well. And whenever I have the bandwidth, you will also have this as audio.
So to get all that info and be able to send me emails, download this book of mine, which is fantastic. Later, when I update the AI chapter, if you already have the PDF version, you will get the updated version for free. There's also the paper version, which can be bought on Amazon, and if you like audiobooks, there is an audio version as well.
I will say happy new year and happy holidays, because this is gonna be our last livestream this year. So thank you so much for this DigiPath Digest. You guys are amazing.
Love to you, to all the digital pathology trailblazers. We had Thanksgiving recently, and I visited Pathology Visions. I always go to conferences and take selfies with people, and I felt so grateful that there are so many people interested in this niche topic. One person in the comments, when I was posting shorts from something niche, commented, "Who the heck is even interested in this?" Let me tell you: at least those people who are subscribed to my social media. But Pathology Visions had 850 people attending this year, and, you know, it's difficult to attend a conference. You have to book a hotel, you have to pay for the conference and all this different stuff, and yet 850 people decided that this is worth their while. And I felt like, hey, we have a strong tribe of digital pathology trailblazers who are learning about it, who are showing up at 6 a.m. or whatever time it is every week. So thank you for that.
And this particular series, DigiPath Digest, had been in my head, with me not actually doing it, for a lot longer than we've been doing it. Today is episode 19, or 20. We have been doing this for around 20 weeks. At the beginning I was not sure if anybody was going to show up, but consistently every week I have some people listening to this, and when I look at the social media sites, a lot of people are viewing the recording. So this is useful. Thank you so much for showing up.
Let's have a wonderful day and a wonderful rest of the year. You're going to be getting emails, and there are going to be more videos. We just will not meet live anymore this year, but I will try to do this Aleks plus AI podcast combo. Thank you so much, and have a wonderful day.