Paper Title

Deep Learning for Distinguishing Normal versus Abnormal Chest Radiographs and Generalization to Unseen Diseases

Authors

Zaid Nabulsi, Andrew Sellergren, Shahar Jamshy, Charles Lau, Edward Santos, Atilla P. Kiraly, Wenxing Ye, Jie Yang, Rory Pilgrim, Sahar Kazemzadeh, Jin Yu, Sreenivasa Raju Kalidindi, Mozziyar Etemadi, Florencia Garcia-Vicente, David Melnick, Greg S. Corrado, Lily Peng, Krish Eswaran, Daniel Tse, Neeral Beladia, Yun Liu, Po-Hsuan Cameron Chen, Shravya Shetty

Abstract

Chest radiography (CXR) is the most widely used thoracic clinical imaging modality and is crucial for guiding the management of cardiothoracic conditions. The detection of specific CXR findings has been the main focus of several artificial intelligence (AI) systems. However, the wide range of possible CXR abnormalities makes it impractical to build specific systems to detect every possible condition. In this work, we developed and evaluated an AI system to classify CXRs as normal or abnormal. For development, we used a de-identified dataset of 248,445 patients from a multi-city hospital network in India. To assess generalizability, we evaluated our system using 6 international datasets from India, China, and the United States. Of these datasets, 4 focused on diseases that the AI was not trained to detect: 2 datasets with tuberculosis and 2 datasets with coronavirus disease 2019. Our results suggest that the AI system generalizes to new patient populations and abnormalities. In a simulated workflow where the AI system prioritized abnormal cases, the turnaround time for abnormal cases was reduced by 7-28%. These results represent an important step towards evaluating whether AI can be safely used to flag cases in a general setting where previously unseen abnormalities exist.
