Paper Title
HistoKT: Cross Knowledge Transfer in Computational Pathology
Paper Authors
Paper Abstract
The lack of well-annotated datasets in computational pathology (CPath) obstructs the application of deep learning techniques for classifying medical images. Many CPath workflows involve transferring learned knowledge between various image domains through transfer learning. Currently, most transfer learning research follows a model-centric approach, tuning network parameters to improve transfer results over a few datasets. In this paper, we take a data-centric approach to the transfer learning problem and examine the existence of generalizable knowledge between histopathological datasets. First, we create a standardization workflow for aggregating existing histopathological data. We then measure inter-domain knowledge by training ResNet18 models across multiple histopathological datasets, and cross-transferring between them to determine the quantity and quality of innate shared knowledge. Additionally, we use weight distillation to share knowledge between models without additional training. We find that hard-to-learn, multi-class datasets benefit most from pretraining, and that a two-stage learning framework incorporating a large source domain such as ImageNet allows for better utilization of smaller datasets. Furthermore, we find that weight distillation enables models trained on purely histopathological features to outperform models using external natural image data.
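The abstract names a two-stage learning framework and weight distillation without detailing either. The PyTorch sketch below is one plausible reading, not the paper's confirmed method: it fine-tunes an ImageNet-pretrained ResNet18 on a large source histopathology dataset, then on a smaller target dataset, and includes a hypothetical weight_distill helper that copies shape-matched parameters between models with no further training. All loaders, class counts, and hyperparameters are illustrative stand-ins.

```python
# A minimal sketch, assuming "two-stage learning" means ImageNet -> large
# histopathology source -> small histopathology target, and that "weight
# distillation" copies compatible trained weights between models without
# additional training. Not confirmed as the paper's exact procedure.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

def build_resnet18(num_classes, weights=None):
    """Create a ResNet18 and replace its classification head."""
    model = models.resnet18(weights=weights)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

def fine_tune(model, loader, epochs=1, lr=1e-3):
    """One stage of supervised fine-tuning with cross-entropy."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            criterion(model(images), labels).backward()
            optimizer.step()
    return model.cpu()

def weight_distill(target, source):
    """Copy parameters with matching names and shapes from source into
    target, sharing knowledge with no gradient updates (a hedged
    interpretation of the abstract's weight distillation)."""
    tgt_state = target.state_dict()
    for name, param in source.state_dict().items():
        if name in tgt_state and tgt_state[name].shape == param.shape:
            tgt_state[name] = param.clone()
    target.load_state_dict(tgt_state)
    return target

# Dummy loaders standing in for real histopathology datasets.
source_loader = DataLoader(TensorDataset(torch.randn(8, 3, 224, 224),
                                         torch.randint(0, 9, (8,))), batch_size=4)
target_loader = DataLoader(TensorDataset(torch.randn(8, 3, 224, 224),
                                         torch.randint(0, 4, (8,))), batch_size=4)

# Stage 1: ImageNet initialization, fine-tuned on the large source domain.
model = build_resnet18(9, weights=models.ResNet18_Weights.IMAGENET1K_V1)
model = fine_tune(model, source_loader)

# Stage 2: swap the head for the target task, fine-tune on the small target set.
model.fc = nn.Linear(model.fc.in_features, 4)
model = fine_tune(model, target_loader, lr=1e-4)

# Training-free knowledge sharing: load the learned weights into a fresh model.
fresh = build_resnet18(4)
fresh = weight_distill(fresh, model)
```

In the actual study, each stage would use the paper's curated datasets and training schedule; the shape-matched copy in weight_distill is only one plausible mechanism for sharing knowledge between models without additional training.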