Paper title

Review learning: Real world validation of privacy preserving continual learning across medical institutions

Authors

Yoo, Jaesung; Choi, Sunghyuk; Yang, Ye Seul; Kim, Suhyeon; Choi, Jieun; Lim, Dongkyeong; Lim, Yaeji; Joo, Hyung Joon; Kim, Dae Jung; Park, Rae Woong; Yoon, Hyeong-Jin; Kim, Kwangsoo

Abstract

When a deep learning model is trained sequentially on different datasets, it often forgets the knowledge learned from previous data, a problem known as catastrophic forgetting. This degrades the model's performance across diverse datasets, which is critical in privacy-preserving deep learning (PPDL) applications based on transfer learning (TL). To overcome this, we introduce "review learning" (RevL), a low-cost continual learning algorithm for diagnosis prediction using electronic health records (EHR) within a PPDL framework. RevL generates data samples from the model, which are used to review knowledge from previous datasets. Six simulated institutional experiments and one real-world experiment involving three medical institutions were conducted to validate RevL, using three binary classification EHR datasets. In the real-world experiment with data from 106,508 patients, the mean global area under the receiver operating characteristic curve was 0.710 for RevL and 0.655 for TL. These results demonstrate RevL's ability to retain previously learned knowledge and its effectiveness in real-world PPDL scenarios. Our work establishes a realistic pipeline for PPDL research based on model transfers across institutions and highlights the practicality of continual learning in real-world medical settings using private EHR data.
