Paper Title

Using Multi-modal Data for Improving Generalizability and Explainability of Disease Classification in Radiology

Authors

Pranav Agnihotri, Sara Ketabi, Khashayar Namdar, Farzad Khalvati

Abstract

Traditional datasets for radiological diagnosis tend to provide only the radiology image alongside the radiology report. However, radiology reading as performed by radiologists is a complex process, and information such as the radiologist's eye fixations over the course of the reading has the potential to be an invaluable data source to learn from. Nonetheless, the collection of such data is expensive and time-consuming, which raises the question of whether it is worth the investment to collect. This paper utilizes the recently published Eye-Gaze dataset to perform an exhaustive study of the impact of varying levels of input features on the performance and explainability of deep learning (DL) classification, namely: radiology images, radiology report text, and radiologist eye-gaze data. We find that the best classification performance on X-ray images is achieved with a combination of radiology report free-text and radiology images, with the eye-gaze data providing no performance boost. Nonetheless, eye-gaze data serving as a secondary ground truth alongside the class label results in highly explainable models that generate better attention maps than models trained to perform classification and attention-map generation without eye-gaze data.
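The idea of using eye-gaze data as a secondary ground truth can be pictured as a two-headed network trained with a joint loss: a classification head supervised by the class label, and a spatial attention map supervised by the radiologist's gaze heatmap. The sketch below is a minimal illustration of that training setup, not the paper's actual architecture; the `GazeSupervisedClassifier` network, the attention-weighted pooling, the MSE gaze-alignment term, and the weight `lam` are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazeSupervisedClassifier(nn.Module):
    """Toy CNN that outputs both class logits and a spatial attention map.

    The attention map can be supervised with a radiologist eye-gaze
    heatmap as a secondary ground truth, alongside the class label.
    """
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.attn_head = nn.Conv2d(64, 1, 1)        # 1-channel attention map
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        feats = self.features(x)                     # (B, 64, H/2, W/2)
        attn = torch.sigmoid(self.attn_head(feats))  # (B, 1, H/2, W/2)
        # Attention-weighted global average pooling feeds the classifier.
        pooled = (feats * attn).mean(dim=(2, 3))     # (B, 64)
        return self.classifier(pooled), attn

def joint_loss(logits, attn, labels, gaze_map, lam=0.5):
    """Class label is the primary target; the eye-gaze heatmap acts as a
    secondary ground truth that the attention map is trained to match."""
    cls_loss = F.cross_entropy(logits, labels)
    # Resize the gaze heatmap to the attention map's spatial resolution.
    gaze_target = F.interpolate(gaze_map, size=attn.shape[-2:],
                                mode="bilinear", align_corners=False)
    attn_loss = F.mse_loss(attn, gaze_target)
    return cls_loss + lam * attn_loss

# Illustrative usage with dummy tensors:
model = GazeSupervisedClassifier(num_classes=3)
x = torch.randn(4, 1, 224, 224)       # batch of single-channel X-rays
labels = torch.randint(0, 3, (4,))    # class labels (primary ground truth)
gaze = torch.rand(4, 1, 224, 224)     # gaze heatmaps (secondary ground truth)
logits, attn = model(x)
loss = joint_loss(logits, attn, labels, gaze)
loss.backward()
```

The design choice to add here, under these assumptions, is that gaze supervision only shapes the attention map; the classification loss is unchanged, which is consistent with the paper's finding that eye-gaze data improves explainability (attention maps) without boosting classification performance.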
